Black Population and Unaffordable Housing in Seattle
Summary
This project looks at unaffordable housing in Seattle, WA. The topic has come up many times in my previous classes, and I wanted to see the problem visually while digging deeper into which parts of Seattle are affected the most and how a specific minority group is affected differently. I originally wanted to compare every minority group but ran out of time, so I only looked at the black population.
Research Questions
- What are the census tracts that have the highest black population percentage?
- What census tracts struggle the most with affordable housing?
- How does affordable housing affect the black community in Seattle, WA?
Motivation
Gentrification in Seattle is not new. In fact, it still persists today as investors drive out black families in order to "improve" a poor urban neighborhood. I chose this project to understand what the situation looks like geographically and which areas are being hit harder than others. I think this project can help policymakers and interest groups build an argument for putting a stop to gentrification.
Data Setting
ACS Data: This data is collected annually by the Census Bureau and gives estimates of population, age, sex, etc. for nearly every Census Tract in the United States. I will mainly be using the population estimate columns for this project. I downloaded the DP05 table from data.census.gov. I selected Seattle, Washington on the website and downloaded the CSV because my computer would crash if I downloaded the entire nation's data.
Affordable Housing Data: This data was collected from Esri's ArcGIS Washington open data. The dataset covers almost every census tract in Washington state and describes housing affordability. The percentage it gives is the share of people who breach the 30% rule for housing. The 30% rule states that in order to have a financially stable household, you should spend no more than 30% of your income on housing.
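To make the 30% rule concrete, here is a minimal sketch of how such a percentage could be computed. The household figures and the function name `is_cost_burdened` are made up for illustration, and I am assuming the rule is applied per household; the dataset itself only provides the final percentage per tract.

```python
# Hypothetical households: (monthly income, monthly housing cost) in dollars.
households = [
    (4000, 1000),  # 25% of income
    (3000, 1200),  # 40% of income
    (5000, 1500),  # exactly 30%, which does not exceed the rule
    (2500, 1100),  # 44% of income
]

THRESHOLD = 0.30  # the 30% rule


def is_cost_burdened(income, housing_cost, threshold=THRESHOLD):
    """Return True when housing takes more than `threshold` of income."""
    return housing_cost / income > threshold


burdened = [is_cost_burdened(income, cost) for income, cost in households]

# Share of households that breach the rule, as a percentage.
percentage = 100 * sum(burdened) / len(households)
print(f"{percentage:.1f}% of households exceed the 30% rule")
```

A household spending exactly 30% counts as within the rule here, since the rule says "no more than 30%".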
Methods
To answer the first research question:
- Read and join the three datasets
- Remove non-numeric columns or convert them to avoid type errors
- Rename columns
- Calculate the black population percentage for each Census Tract and the mean across Seattle
- Create a map showing the black population percentage for each Census Tract
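The join and percentage steps above can be sketched on toy data. This is only an illustration under assumptions: the rows are invented, `tract_key` is a hypothetical helper column, and I am assuming the ACS `GEO_ID` values carry a `1400000US` prefix in front of the 11-digit tract ID, which is how tract-level downloads from data.census.gov usually look. The sketch joins on a string key, which is one alternative to casting the IDs to float.

```python
import pandas as pd

# Invented rows standing in for the ACS table (assumed GEO_ID format).
census = pd.DataFrame({
    "GEO_ID": ["1400000US53033000100", "1400000US53033000200"],
    "totalPop": [7000, 8000],
    "blackPop": [700, 400],
})

# Invented rows standing in for the housing table.
housing = pd.DataFrame({
    "Census_Tract": [53033000100, 53033000200],
    "Percentage": [41.2, 35.5],
})

# Build an 11-character string key (state + county + tract) on both sides.
census["tract_key"] = census["GEO_ID"].str[-11:]
housing["tract_key"] = housing["Census_Tract"].astype(str).str.zfill(11)

# validate= raises an error if a tract appears more than once on either side.
merged = housing.merge(census, on="tract_key", how="left", validate="one_to_one")

# Black population as a percentage of each tract's total population.
merged["Percentage_Black"] = merged["blackPop"] / merged["totalPop"] * 100
print(merged[["tract_key", "Percentage", "Percentage_Black"]])
```

String keys keep any leading zeros in the state code and avoid relying on float precision for long IDs.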
To answer the second research question:
- Use the Percentage column to create Severity Ranking Buckets through quantile binning
- Map the severity rankings for every census tract on Folium Map
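As a small illustration of the quantile binning step, the sketch below applies `pd.qcut` to ten invented percentages. The values are hypothetical; only the call itself (`q=5` with labels 1 to 5) mirrors what the notebook does.

```python
import pandas as pd

# Invented percentages of people spending over 30% of income on housing.
percentages = pd.Series(
    [22.0, 25.5, 28.0, 30.1, 31.4, 33.0, 35.2, 38.9, 41.0, 62.3]
)

# Five quantile buckets: each bucket receives the same number of tracts.
ranking = pd.qcut(percentages, q=5, labels=[1, 2, 3, 4, 5])

# Count how many tracts landed in each severity ranking.
counts = ranking.value_counts().sort_index()
print(counts)
```

With ten distinct values and five buckets, each ranking gets two tracts, regardless of how far apart the values are.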
To answer the third research question:
- Map the black population percentage for each census tract
- Highlight the census tracts that are in the highest severity rankings (4 & 5)
- Create a bar graph to plot the mean black percentage of each ranking
- Plot the mean black percentage for all of Seattle
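The numbers behind the bar graph can be sketched with a `groupby`. The rows below are invented, and the names `tracts`, `mean_by_ranking`, and `citywide_mean` are hypothetical; the column names match the ones used in the notebook.

```python
import pandas as pd

# Invented tracts with a severity ranking and a black population percentage.
tracts = pd.DataFrame({
    "Ranking": [1, 1, 2, 3, 4, 4, 5, 5],
    "Percentage_Black": [2.0, 4.0, 5.0, 6.0, 8.0, 12.0, 9.0, 15.0],
})

# Mean black percentage within each severity ranking (one bar per ranking).
mean_by_ranking = tracts.groupby("Ranking")["Percentage_Black"].mean()

# Unweighted mean across all tracts, usable as a reference line.
citywide_mean = tracts["Percentage_Black"].mean()

print(mean_by_ranking)
print(f"Mean across all tracts: {citywide_mean:.3f}%")
```

Note that both means treat every tract equally. Weighting by each tract's population would generally give a different citywide figure.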
sources = [
    "https://data.census.gov/table?q=dp05&g=040XX00US53,53$1400000",
    "https://data-wa-geoservices.opendata.arcgis.com/datasets/WADOH::unaffordable-housing-current-version/about",
    "mapping.ipynb",
    "https://realpython.com/python-folium-web-maps-from-data/",
    "https://python-visualization.github.io/branca/colormap.html",
    "https://python-visualization.github.io/folium/latest/getting_started.html",
    "https://plotly.com/python-api-reference/plotly.express.html",
    "https://plotly.com/python-api-reference/plotly.graph_objects.html",
    "https://stackoverflow.com/questions/75462763/plotly-graph-not-showing-up-in-rendered-html",
    "https://plotly.com/python/renderers/",
    "https://plotly.com/python/creating-and-updating-figures/",
    "https://plotly.com/python/builtin-colorscales/",
    "https://plotly.com/python/colorscales/",
    "https://python-visualization.github.io/folium/latest/user_guide/geojson/geojson.html",
    "https://edstem.org/us/courses/56782/discussion/5002658",
    "https://plotly.com/python/text-and-annotations/",
    "https://plotly.com/python/reference/layout/annotations/",
    "https://python-visualization.github.io/folium/latest/user_guide/geojson/geojson_popup_and_tooltip.html",
]
import pandas as pd
import geopandas as gpd
import numpy as np
import folium
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from branca.colormap import LinearColormap
pio.renderers.default = 'notebook'
# Initialize
census_df = pd.read_csv("ACSDP5Y2019.DP05-Data.csv", low_memory=False)
housing_df = pd.read_csv("Unaffordable_Housing_(Current_Version).csv")
# Selecting desired columns
census_df = census_df[['GEO_ID', 'NAME', 'DP05_0063E', 'DP05_0064E', 'DP05_0065E', 'DP05_0066E', 'DP05_0067E', 'DP05_0068E', 'DP05_0069E', 'DP05_0086E']]
# Rename
census_df = census_df.rename(columns={
    'GEO_ID': 'Census_Tract',
    'NAME': 'Census_Name',
    'DP05_0063E': 'totalPop',
    'DP05_0064E': 'whitePop',
    'DP05_0065E': 'blackPop',
    'DP05_0066E': 'nativePop',
    'DP05_0067E': 'asianPop',
    'DP05_0068E': 'pacificPop',
    'DP05_0069E': 'otherPop',
    'DP05_0086E': 'total_units'
})
# Keep only rows whose 'Census_Tract' is numeric, then cast to float so the merge keys match
census_df = census_df[pd.to_numeric(census_df['Census_Tract'], errors='coerce').notnull()].copy()
census_df['Census_Tract'] = census_df['Census_Tract'].astype(float)
# Convert the selected columns to numeric; values that cannot be parsed become NaN
renamed = ['totalPop', 'whitePop', 'blackPop', 'nativePop', 'asianPop', 'pacificPop', 'otherPop', 'total_units']
census_df[renamed] = census_df[renamed].apply(pd.to_numeric, errors='coerce')
# Join csv's
df = housing_df.merge(census_df, on='Census_Tract', how='left')
# Read GEOJSON
geo_json = gpd.read_file("Census_Tracts_2010.geojson")
geo_json['GEOID10'] = geo_json['GEOID10'].astype(float)
# Join with already merged csv's
df = geo_json.merge(df, left_on='GEOID10', right_on='Census_Tract', how='left')
df = df.dropna(subset=['TRACT'])
# Find black population percentage for every census tract and the overall mean for Seattle
df['Percentage_Black'] = (df['blackPop'] / df['totalPop']) * 100
mean_black_percentage = df['Percentage_Black'].mean()
# SEVERITY RANKINGS
df['Percentage'] = pd.to_numeric(df['Percentage'], errors='coerce')
# QUANTILE BINNING FOR SEVERITY RANKINGS so there is an equal amount per bucket
df['Ranking'] = pd.qcut(df['Percentage'], q=5, labels=[1, 2, 3, 4, 5])
# Tests for dataframe
assert df['blackPop'].dtype == 'float64'
assert df['totalPop'].dtype == 'float64'
assert len(df) == 135
assert df.loc[df['Percentage'].idxmax(), 'NAMELSAD10'] == 'Census Tract 53.02', f"Expected {'Census Tract 53.02'}, but got {df.loc[df['Percentage'].idxmax(), 'NAMELSAD10']}"
expected_columns = [
    "OBJECTID_x", "TRACT", "TRACTCE10", "GEOID10", "NAME10",
    "NAMELSAD10", "ACRES_TOTAL", "WATER", "SHAPE_Length_x",
    "SHAPE_Area_x", "geometry", "OBJECTID_y", "Census_Tract",
    "IBL_Rank", "EHD_Rank", "Env_SEF_Rank", "Count_", "Population",
    "Percentage", "Lower_ME", "Upper_ME", "SHAPE_Length_y", "SHAPE_Area_y",
    "Census_Name", "totalPop", "whitePop", "blackPop", "nativePop", "asianPop",
    "pacificPop", "otherPop", "total_units", "Percentage_Black", "Ranking"
]
assert list(df.columns) == expected_columns, f"Expected {expected_columns}, but got {df.columns.tolist()}"
m = folium.Map(location=[47.620564, -122.350616], zoom_start=11)
geojson_data = df.to_json()
# Map layer
folium.Choropleth(
    geo_data=geojson_data,
    name='choropleth',
    data=df,
    columns=['NAMELSAD10', 'Percentage_Black'],
    key_on='feature.properties.NAMELSAD10',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Percentage Black',
).add_to(m)
# Hover layer: highlight the tract under the cursor and show a tooltip
folium.GeoJson(
    geojson_data,
    style_function=lambda feature: {
        'fillColor': 'transparent',
        'color': 'black',
        'weight': 1,
        'fillOpacity': 0.1,
    },
    highlight_function=lambda feature: {
        'fillColor': '#6690ffff',
        'color': 'yellow',
        'weight': 2,
        'fillOpacity': 0.5,
    },
    tooltip=folium.features.GeoJsonTooltip(
        fields=['NAMELSAD10', 'Percentage_Black'],
        aliases=['Census Tract', 'Percentage Black'],
        labels=True,
        sticky=True,
    )
).add_to(m)
m
# Map 2
m2 = folium.Map(location=[47.620564, -122.350616], zoom_start=11)
# Map layer for Severity Ranking
folium.Choropleth(
    geo_data=geojson_data,
    name='choropleth',
    data=df,
    columns=['NAMELSAD10', 'Ranking'],
    key_on='feature.properties.NAMELSAD10',
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Severity Ranking',
    bins=[1, 2, 3, 4, 5, 6]
).add_to(m2)
# Hover layer: highlight the tract under the cursor and show a tooltip
folium.GeoJson(
    geojson_data,
    style_function=lambda feature: {
        'fillColor': 'transparent',
        'color': 'black',
        'weight': 1,
        'fillOpacity': 0.1,
    },
    highlight_function=lambda feature: {
        'fillColor': '#6690ffff',
        'color': 'yellow',
        'weight': 2,
        'fillOpacity': 0.5,
    },
    tooltip=folium.features.GeoJsonTooltip(
        fields=['NAMELSAD10', 'Ranking', 'Percentage'],
        aliases=['Census Tract', 'Severity Ranking', 'Percentage'],
        labels=True,
        sticky=True,
    )
).add_to(m2)
m2
# Map 3: black population percentage with the highest-severity tracts outlined
m3 = folium.Map(location=[47.620564, -122.350616], zoom_start=11)
folium.Choropleth(
    geo_data=geojson_data,
    name='choropleth',
    data=df,
    columns=['NAMELSAD10', 'Percentage_Black'],
    key_on='feature.properties.NAMELSAD10',
    fill_color='BuPu',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Percentage Black',
).add_to(m3)
# Highlight Severity Ranking of 4 or 5
highlighted_tracts = df[df['Ranking'].isin([4, 5])]
highlight_layer = folium.FeatureGroup(name='Outlined Tracts')
folium.GeoJson(
    highlighted_tracts.to_json(),
    style_function=lambda feature: {
        'fillColor': 'transparent',
        'color': 'black',
        'weight': 3,
        'fillOpacity': 0,
    },
    highlight_function=lambda feature: {
        'fillColor': 'transparent',
        'color': 'yellow',
        'weight': 3,
        'fillOpacity': 0,
    },
    tooltip=folium.features.GeoJsonTooltip(
        fields=['NAMELSAD10', 'Ranking', 'Percentage_Black'],
        aliases=['Census Tract', 'Severity Ranking', 'Percentage Black'],
        labels=True,
        sticky=True,
    )
).add_to(highlight_layer)
highlight_layer.add_to(m3)
folium.LayerControl().add_to(m3)
m3
Implications and Limitations
This project has major implications for policymakers and interest groups that advocate for equitable housing for minority groups. Here are the findings from my project:
- Census Tracts with the highest black population percentage are next to one another.
- Severity Rankings show that Census Tracts that have a high percentage of people that spend more than 30% of their income on housing tend to have higher black population percentages.
Limitations:
- The Census Data and Unaffordable Housing datasets are ONLY ESTIMATES. The real percentages and population may differ from the ones used in this project.
- The population percentages change every year, so what is shown in this project may not hold next year. This project pairs 2010 Census Tract boundaries with 2019 housing data.
- Since this data analysis aggregates data based on census tract, it obscures the variations that exist within census tracts, especially large ones (by area).
- Using Quantile Binning for the severity rankings may be misleading if there are outliers or a heavily skewed distribution. While there is one outlier (University of Washington's Census Tract, 53.02), the percentage of people spending over 30% of their income on housing is roughly normally distributed. Still, the limitations of this statistical method must be taken into consideration.
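To show what the quantile binning limitation looks like in practice, here is a small sketch on invented values with a single outlier. It contrasts `pd.qcut` (equal counts per bucket, as used in this project) with `pd.cut` (equal-width ranges), which is a different technique included only for comparison. The numbers are hypothetical and are not taken from the Seattle data.

```python
import pandas as pd

# Nine closely spaced invented values plus one outlier.
values = pd.Series([30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 80.0])
labels = [1, 2, 3, 4, 5]

# Quantile binning: two values per bucket, wherever the values fall.
by_quantile = pd.qcut(values, q=5, labels=labels)

# Equal-width binning: buckets cover equal ranges between the min and the max.
by_width = pd.cut(values, bins=5, labels=labels)

comparison = pd.DataFrame({"value": values, "qcut": by_quantile, "cut": by_width})
print(comparison)
```

With quantile binning, 38.0 and 80.0 share ranking 5 even though they are far apart, while 30.0 and 32.0 end up in different rankings. With equal-width binning the outlier sits alone in ranking 5 and every other value collapses into ranking 1, so neither method is free of trade-offs.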